Approximate Convex Optimization by Online Game Playing
Lagrangian relaxation and approximate optimization algorithms have received
much attention in the last two decades. Typically, the running time of these
methods to obtain an $\epsilon$-approximate solution is proportional to
$1/\epsilon^2$. Recently, Bienstock and Iyengar, following Nesterov,
gave an algorithm for fractional packing linear programs which runs in
$1/\epsilon$ iterations. The latter algorithm requires solving a
convex quadratic program every iteration - an optimization subroutine which
dominates the theoretical running time.
We give an algorithm for convex programs with strictly convex constraints
which runs in time proportional to $1/\epsilon$. The algorithm does not
require solving any quadratic program, but uses gradient steps and elementary
operations only. Problems which have strictly convex constraints include
maximum entropy frequency estimation, portfolio optimization with loss risk
constraints, and various computational problems in signal processing.
As a side product, we also obtain a simpler version of Bienstock and
Iyengar's result for general linear programming, with similar running time.
We derive these algorithms using a new framework for deriving convex
optimization algorithms from online game playing algorithms, which may be of
independent interest.
Almost Optimal Sublinear Time Algorithm for Semidefinite Programming
We present an algorithm for approximating semidefinite programs with running
time that is sublinear in the number of entries in the semidefinite instance.
We also present lower bounds that show our algorithm to have a nearly optimal
running time.
Faster Rates for the Frank-Wolfe Method over Strongly-Convex Sets
The Frank-Wolfe method (a.k.a. conditional gradient algorithm) for smooth
optimization has regained much interest in recent years in the context of large
scale optimization and machine learning. A key advantage of the method is that
it avoids projections - the computational bottleneck in many applications -
replacing them with a linear optimization step. Despite this advantage, the known
convergence rates of the FW method fall behind standard first order methods for
most settings of interest. It is an active line of research to derive faster
linear optimization-based algorithms for various settings of convex
optimization.
In this paper we consider the special case of optimization over strongly
convex sets, for which we prove that the vanilla FW method converges at a rate
of $1/t^2$. This gives a quadratic improvement in convergence rate
compared to the general case, in which convergence is of the order
$1/t$, and known to be tight. We show that various balls induced by
$\ell_p$ norms, Schatten norms and group norms are strongly convex on one hand
and on the other hand, linear optimization over these sets is straightforward
and admits a closed-form solution. We further show how several previous
fast-rate results for the FW method follow easily from our analysis.
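To make the closed-form linear optimization step concrete, here is a minimal sketch of Frank-Wolfe over a Euclidean ball, one simple example of a strongly convex set. It is an illustration under stated assumptions, not the paper's algorithm: the quadratic objective f(x) = 0.5 * ||x - b||^2, the exact line search (valid only for this objective), and the names `lmo_l2_ball` and `frank_wolfe_l2` are all chosen here for illustration.

```python
import math


def lmo_l2_ball(grad, radius):
    """Linear minimization oracle: argmin of <grad, v> over ||v||_2 <= radius.

    For the Euclidean ball the minimizer has the closed form
    v = -radius * grad / ||grad||_2.
    """
    norm = math.sqrt(sum(g * g for g in grad))
    if norm == 0.0:
        return [0.0] * len(grad)
    return [-radius * g / norm for g in grad]


def frank_wolfe_l2(b, radius=1.0, iters=200, x0=None):
    """Minimize 0.5 * ||x - b||^2 over the Euclidean ball of the given radius.

    Assumption for this sketch: the step size comes from exact line search,
    which is available in closed form because the objective is quadratic.
    """
    x = list(x0) if x0 is not None else [0.0] * len(b)
    for _ in range(iters):
        grad = [xi - bi for xi, bi in zip(x, b)]   # gradient of the objective
        v = lmo_l2_ball(grad, radius)              # projection-free step
        d = [vi - xi for vi, xi in zip(v, x)]      # Frank-Wolfe direction
        dd = sum(di * di for di in d)
        if dd == 0.0:
            break                                  # x already equals the LMO output
        # Exact line search for the quadratic, clamped to [0, 1].
        gamma = -sum(gi * di for gi, di in zip(grad, d)) / dd
        gamma = min(1.0, max(0.0, gamma))
        x = [xi + gamma * di for xi, di in zip(x, d)]
    return x


if __name__ == "__main__":
    # b lies outside the unit ball, so the minimizer is b / ||b||.
    print(frank_wolfe_l2([3.0, 4.0]))
```

Each iteration needs only a gradient and one normalization, with no projection. Placing b outside the ball keeps the gradient bounded away from zero on the feasible set, which is the kind of setting in which faster rates over strongly convex sets are typically discussed.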
Universal MMSE Filtering With Logarithmic Adaptive Regret
We consider the problem of online estimation of a real-valued signal
corrupted by oblivious zero-mean noise using linear estimators. The estimator
is required to iteratively predict the underlying signal based on the current
and several most recent noisy observations, and its performance is measured by the
mean-square-error. We describe and analyze an algorithm for this task which:
1. Achieves logarithmic adaptive regret against the best linear filter in
hindsight. This bound is asymptotically tight, and resolves the question of
Moon and Weissman [1].
2. Runs in linear time in terms of the number of filter coefficients.
Previous constructions required at least quadratic time.
Comment: 14 pages
Variance-Reduced and Projection-Free Stochastic Optimization
The Frank-Wolfe optimization algorithm has recently regained popularity for
machine learning applications due to its projection-free property and its
ability to handle structured constraints. However, in the stochastic learning
setting, it is still relatively understudied compared to the gradient descent
counterpart. In this work, leveraging a recent variance reduction technique, we
propose two stochastic Frank-Wolfe variants which substantially improve
previous results in terms of the number of stochastic gradient evaluations
needed to achieve $1-\epsilon$ accuracy. For example, we improve from
$O(1/\epsilon)$ to $O(\ln(1/\epsilon))$ if the objective function
is smooth and strongly convex, and from $O(1/\epsilon^2)$ to
$O(1/\epsilon^{1.5})$ if the objective function is smooth and
Lipschitz. The theoretical improvement is also observed in experiments on
real-world datasets for a multiclass classification application.
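The idea of combining variance reduction with a projection-free step can be sketched as follows. This is an illustrative sketch only, not the variants proposed in the paper: it assumes an SVRG-style gradient estimator, a least-squares finite-sum objective, a Euclidean-ball constraint, a fixed mini-batch size, and the generic step size 2/(k+2). All function names and parameters are hypothetical.

```python
import math
import random


def grad_i(x, a, y):
    """Gradient of the single-sample loss 0.5 * (<a, x> - y)^2."""
    r = sum(ai * xi for ai, xi in zip(a, x)) - y
    return [r * ai for ai in a]


def full_grad(x, A, Y):
    """Average of the per-sample gradients over the whole dataset."""
    n = len(A)
    g = [0.0] * len(x)
    for a, y in zip(A, Y):
        g = [gj + gij / n for gj, gij in zip(g, grad_i(x, a, y))]
    return g


def vr_grad(x, snapshot, mu, a, y):
    """SVRG-style estimator: grad_i(x) - grad_i(snapshot) + full_grad(snapshot)."""
    gx = grad_i(x, a, y)
    gs = grad_i(snapshot, a, y)
    return [p - q + m for p, q, m in zip(gx, gs, mu)]


def lmo_l2_ball(g, radius):
    """Closed-form linear minimization over the Euclidean ball."""
    norm = math.sqrt(sum(t * t for t in g))
    if norm == 0.0:
        return [0.0] * len(g)
    return [-radius * t / norm for t in g]


def vr_frank_wolfe(A, Y, radius=1.0, epochs=5, inner=20, batch=4, seed=0):
    """Frank-Wolfe steps driven by variance-reduced stochastic gradients."""
    rng = random.Random(seed)
    x = [0.0] * len(A[0])
    k = 0
    for _ in range(epochs):
        snapshot = list(x)
        mu = full_grad(snapshot, A, Y)      # one full gradient per epoch
        for _ in range(inner):
            g = [0.0] * len(x)
            for _ in range(batch):          # average a small mini-batch
                i = rng.randrange(len(A))
                gi = vr_grad(x, snapshot, mu, A[i], Y[i])
                g = [gj + gij / batch for gj, gij in zip(g, gi)]
            v = lmo_l2_ball(g, radius)      # no projection needed
            gamma = 2.0 / (k + 2)           # generic schedule (assumption)
            x = [(1 - gamma) * xi + gamma * vi for xi, vi in zip(x, v)]
            k += 1
    return x
```

The estimator is unbiased: averaged over all sample indices it equals the full gradient at x, and its variance shrinks as x approaches the snapshot. Every iterate is a convex combination of feasible points, so it stays inside the ball without any projection.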